An update on statistical boosting in biomedicine

Author: Gefeller Olaf
Hepp Tobias
Hofner Benjamin
Mayr Andreas
Schmid Matthias
Waldmann Elisabeth
Publication date: 01/01/2017

Statistical boosting algorithms have triggered a lot of research during the last decade. They combine a powerful machine-learning approach with classical statistical modelling, offering various practical advantages like automated variable selection and implicit regularization of effect estimates. They are extremely flexible, as the underlying base-learners (regression functions defining the type of effect for the explanatory variables) can be combined with any kind of loss function (target function to be optimized, defining the type of regression setting). In this review article, we highlight the most recent methodological developments on statistical boosting regarding variable selection, functional regression and advanced time-to-event modelling. Additionally, we provide a short overview of relevant applications of statistical boosting in biomedicine.

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

Open Access LMU

Extending the inferential capabilities of model-based gradient boosting algorithms

Author: Hepp Tobias
Publication date: 01/01/2019

Background and aims: The rapid development of computer technology in recent decades has not only enabled the practical application of previously merely theoretical ideas of statistical data analysis but has also led to a multitude of new and increasingly computationally intensive analysis strategies emerging from the field of machine learning. Further developments of the successful boosting algorithms revealed their relationship to known statistical concepts and made them usable for the estimation of regularized regression parameters of additive models. This thesis focuses on the resulting model properties as well as their improvement and extension with regard to inferential statistical validity and interpretability. Methods: All presented approaches address various forms of model-based boosting algorithms. The algorithm is initialized with an empty model, which is sequentially updated in the following iterations by repeated application of small regression functions to build a final additive model. This thesis examines the resulting estimators and model properties in comparison with other regularization methods such as $L_1$-penalization. In addition, alternative strategies for improving the variable selection properties of the overall model are proposed and alternative strategies for testing individual effects are developed. For this purpose, variants of variable permutation and bootstrapping methods are mainly used. Results: Regularization of linear effect estimators by means of model-based gradient boosting behaves asymptotically like $L_1$-penalization with decreasing learning rate $\nu$ if and only if the inverse covariance matrix of the predictor variables is diagonally dominant. Differences between the methods can be traced back to the sequential aggregation of the boosting model, which stabilizes the regularization paths but tends to make the models include more variables. Therefore, in order to avoid a large number of false positive selections, the focus can be shifted from prediction accuracy to variable selection accuracy by extending the dataset with permuted versions of the predictor variables. Residual permutation and parametric bootstrap allow the computation of p-values with test power on par with Wald tests for maximum likelihood estimators in low-dimensional scenarios. Practical conclusions: The results of this work provide a guideline for the choice between boosting and $L_1$-penalization as a regularization method for statistical models. In addition, the applicability of model-based gradient boosting algorithms is improved in situations where more detailed interpretation of the selected variables is of central interest. The accuracy of variable selection can be increased by alternative tuning via permuted variables. Moreover, the parametric bootstrap allows, for the first time, the calculation of p-values for single effect estimators of model-based gradient boosting algorithms in high-dimensional scenarios with correlated predictor variables.
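The fitting scheme described in the Methods part of this abstract (start from an empty model, then repeatedly add a small regression function scaled by the learning rate $\nu$) can be illustrated with a minimal sketch of component-wise boosting for squared-error loss with simple linear base-learners. This is not the thesis implementation; the function name `componentwise_l2_boost`, the helper `center`, the simulated data and the default values of `nu` and `n_iter` are all assumptions made for illustration.

```python
import random


def center(col):
    """Subtract the mean so that a no-intercept linear base-learner is adequate."""
    m = sum(col) / len(col)
    return [v - m for v in col]


def componentwise_l2_boost(columns, y, nu=0.1, n_iter=200):
    """Illustrative component-wise L2 boosting (hypothetical helper).

    columns: list of predictor columns (lists of floats)
    nu:      learning rate (step length), assumed default
    n_iter:  number of boosting iterations, assumed default
    Returns the intercept, the coefficients of the centered predictors,
    and the sequence of selected predictor indices.
    """
    n = len(y)
    cols = [center(c) for c in columns]
    intercept = sum(y) / n                 # empty model: offset only
    resid = [yi - intercept for yi in y]   # negative gradient of squared-error loss
    coef = [0.0] * len(cols)
    path = []
    for _ in range(n_iter):
        best_j, best_beta, best_gain = None, 0.0, 0.0
        for j, col in enumerate(cols):
            sxx = sum(v * v for v in col)
            if sxx == 0.0:
                continue
            sxr = sum(v * r for v, r in zip(col, resid))
            gain = sxr * sxr / sxx         # drop in residual sum of squares
            if gain > best_gain:
                best_j, best_beta, best_gain = j, sxr / sxx, gain
        if best_j is None:
            break
        # Only the best-fitting base-learner is updated, and only by a fraction nu.
        coef[best_j] += nu * best_beta
        resid = [r - nu * best_beta * v for r, v in zip(resid, cols[best_j])]
        path.append(best_j)
    return intercept, coef, path


# Simulated example (assumption): only the first predictor carries signal.
rng = random.Random(1)
n = 200
x0 = [rng.gauss(0, 1) for _ in range(n)]
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]
y = [2.0 * a + rng.gauss(0, 0.5) for a in x0]

intercept, coef, path = componentwise_l2_boost([x0, x1, x2], y)
print("coefficients:", coef)
print("first selections:", path[:5])
```

Because each iteration updates only one coefficient by a small step, stopping early yields shrunken estimates and leaves never-selected predictors at exactly zero, which is the implicit regularization and variable selection that the abstract compares with $L_1$-penalization.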

MedGAN: Medical Image Translation using GANs

Author: Armanious Karim
Fischer Marc
Gatidis Sergios
Hepp Tobias
Jiang Chenming
Küstner Thomas
Nikolaou Konstantin
Yang Bin
Publication venue: 'Elsevier BV'
Publication date: 04/04/2019

Image-to-image translation is considered a new frontier in the field of medical image analysis, with numerous potential applications. However, a large portion of recent approaches offer individualized solutions based on specialized task-specific architectures or require refinement through non-end-to-end training. In this paper, we propose a new framework, named MedGAN, for medical image-to-image translation which operates on the image level in an end-to-end manner. MedGAN builds upon recent advances in the field of generative adversarial networks (GANs) by merging the adversarial framework with a new combination of non-adversarial losses. We utilize a discriminator network as a trainable feature extractor which penalizes the discrepancy between the translated medical images and the desired modalities. Moreover, style-transfer losses are utilized to match the textures and fine structures of the desired target images to the translated images. Additionally, we present a new generator architecture, titled CasNet, which enhances the sharpness of the translated medical outputs through progressive refinement via encoder-decoder pairs. Without any application-specific modifications, we apply MedGAN to three different tasks: PET-CT translation, correction of MR motion artefacts and PET image denoising. Perceptual analysis by radiologists and quantitative evaluations illustrate that MedGAN outperforms other existing translation approaches. Comment: 16 pages, 8 figures

arXiv.org e-Print Archive

King's Research Portal

ipA-MedGAN: Inpainting of Arbitrary Regions in Medical Imaging

Author: Abdulatif Sherif
Armanious Karim
Gatidis Sergios
Hepp Tobias
Kumar Vijeth
Yang Bin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/01/2020

Local deformations in medical modalities are common phenomena due to a multitude of factors such as metallic implants or limited fields of view in magnetic resonance imaging (MRI). Completion of the missing or distorted regions is of special interest for automatic image analysis frameworks to enhance post-processing tasks such as segmentation or classification. In this work, we propose a new generative framework for medical image inpainting, titled ipA-MedGAN. It bypasses the limitations of previous frameworks by enabling inpainting of arbitrarily shaped regions without prior localization of the regions of interest. Thorough qualitative and quantitative comparisons with other inpainting and translational approaches have illustrated the superior performance of the proposed framework for the task of brain MR inpainting. Comment: Submitted to IEEE ICIP 202

arXiv.org e-Print Archive

Crossref

Probing for Sparse and Fast Variable Selection with Model-Based Boosting

Author: Bischl Bernd
Hepp Tobias
Mayr Andreas
Thomas Janek
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2017

We present a new variable selection method based on model-based gradient boosting and randomly permuted variables. Model-based boosting is a tool to fit a statistical model while performing variable selection at the same time. A drawback of the fitting lies in the need for multiple model fits on slightly altered data (e.g., cross-validation or bootstrap) to find the optimal number of boosting iterations and prevent overfitting. In our proposed approach, we augment the data set with randomly permuted versions of the true variables, so-called shadow variables, and stop the stepwise fitting as soon as such a variable would be added to the model. This allows variable selection in a single fit of the model without requiring further parameter tuning. We show that our probing approach can compete with state-of-the-art selection methods like stability selection in a high-dimensional classification benchmark and apply it to three gene expression data sets.
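The stopping rule described in this abstract can be sketched in a few lines: each predictor gets a randomly permuted shadow copy, and component-wise boosting stops the first time a shadow would be selected. The sketch below uses squared-error loss and linear base-learners purely for illustration. It is not the authors' implementation, and the function name `probing_select`, the simulated data and the values of `nu`, `max_iter` and `seed` are assumptions.

```python
import random


def probing_select(columns, y, nu=0.1, max_iter=1000, seed=0):
    """Illustrative probing with shadow variables (hypothetical helper).

    Returns the sorted indices of the original predictors selected before
    the first shadow variable would enter the model.
    """
    rng = random.Random(seed)
    n, p = len(y), len(columns)

    def center(col):
        m = sum(col) / len(col)
        return [v - m for v in col]

    cols = [center(c) for c in columns]
    # Shadow variables: permuted copies keep the marginal distribution
    # but break any association with the response.
    shadows = []
    for c in cols:
        s = c[:]
        rng.shuffle(s)
        shadows.append(s)
    augmented = cols + shadows             # indices >= p are shadows

    mean_y = sum(y) / n
    resid = [yi - mean_y for yi in y]
    chosen = set()
    for _ in range(max_iter):
        best_j, best_beta, best_gain = None, 0.0, 0.0
        for j, col in enumerate(augmented):
            sxx = sum(v * v for v in col)
            if sxx == 0.0:
                continue
            sxr = sum(v * r for v, r in zip(col, resid))
            gain = sxr * sxr / sxx
            if gain > best_gain:
                best_j, best_beta, best_gain = j, sxr / sxx, gain
        # Stop as soon as a shadow variable would be added to the model.
        if best_j is None or best_j >= p:
            break
        chosen.add(best_j)
        resid = [r - nu * best_beta * v for r, v in zip(resid, augmented[best_j])]
    return sorted(chosen)


# Simulated example (assumption): predictors 0 and 1 are informative, 2..5 are noise.
rng = random.Random(42)
n, p = 200, 6
X = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
y = [3.0 * X[0][i] - 2.0 * X[1][i] + rng.gauss(0, 0.5) for i in range(n)]

selected = probing_select(X, y)
print("selected predictors:", selected)
```

The design point is that the stopping iteration comes from the data augmentation itself, so no cross-validation or bootstrap refits are needed. The price is that the criterion targets variable selection rather than prediction accuracy.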

arXiv.org e-Print Archive

Directory of Open Access Journals

Open Access LMU

Associated factors and comorbidities in patients with pyoderma gangrenosum in Germany: a retrospective multicentric analysis in 259 patients

Author: Al Ghazal Philipp
Dill Dorothea
Dissemond Joachim
Eming Sabine
Goerge Tobias
Hepp Julia
Herberger Katharina
Hoff Norman-Philipp
Horn Thomas
Karrer Sigrid
Klode Joachim
Maschke Jan
Rabe Eberhard
Renner Regina
Roth Hannelore
Schaller Joerg
Sick Isabell
Splieth Benno
Stroelin Anke
Wollina Uwe
Zutt Markus
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Background: Pyoderma gangrenosum (PG) is a rarely diagnosed ulcerative neutrophilic dermatosis of unknown origin that has been poorly characterized in clinical studies so far. Consequently, there have been significant discussions about its associated factors and comorbidities. The aim of our multicenter study was to analyze current data from patients in dermatologic wound care centers in Germany in order to describe associated factors and comorbidities in patients with PG. Methods: Retrospective clinical investigation of patients with PG from dermatologic wound care centers in Germany. Results: We received data from 259 patients with PG from 20 different dermatologic wound care centers in Germany. Of these, 142 (54.8%) patients were female and 117 (45.2%) were male, with an age range of 21 to 95 years and a mean of 58 years. In our patient population we found 45.6% with anemia, 44.8% with endocrine diseases, 12.4% with internal malignancies, 9.3% with chronic inflammatory bowel diseases and 4.3% with elevated creatinine levels. Moreover, 25.5% of all patients had diabetes mellitus with some aspects of potential association with the metabolic syndrome. Conclusions: Our study describes one of the world's largest populations with PG. Besides the well-known association with chronic bowel diseases and neoplasms, a potentially relevant new aspect is an association with endocrine diseases, in particular the metabolic syndrome, thyroid dysfunctions and renal disorders. Our findings represent clinically relevant new aspects. This may help to describe the patients' characteristics and to understand the underlying pathophysiology in these often misdiagnosed patients.

Crossref

Kölner UniversitätsPublikationsServer

Springer - Publisher Connector

Open Access LMU

PubMed Central

A new attraction-detachment model for explaining flow sliding in clay-rich tephras

Author: Churchman G. J.
Hepp Daniel A.
Jorat M. Ehsan
Kluger Max O.
Kreiter Stefan
Lowe David J.
Moon Vicki G.
Mörz Tobias
Seibel David
Publication date: 01/11/2016

Altered pyroclastic (tephra) deposits are highly susceptible to landsliding, leading to fatalities and property damage every year. Halloysite, a low-activity clay mineral, is commonly associated with landslide-prone layers within altered tephra successions, especially in deposits with high sensitivity, which describes the post-failure strength loss. However, the precise role of halloysite in the development of sensitivity, and thus in sudden and unpredictable landsliding, is unknown. Here we show that an abundance of mushroom cap–shaped (MCS) spheroidal halloysite governs the development of sensitivity, and hence proneness to landsliding, in altered rhyolitic tephras, North Island, New Zealand. We found that a highly sensitive layer, which was involved in a flow slide, has a remarkably high content of aggregated MCS spheroids with substantial openings on one side. We suggest that short-range electrostatic and van der Waals interactions enabled the MCS spheroids to form interconnected aggregates by attraction between the edges of numerous paired silanol and aluminol sheets that are exposed in the openings and the convex silanol faces on the exterior surfaces of adjacent MCS spheroids. If these weak attractions are overcome during slope failure, multiple, weakly attracted MCS spheroids can be separated from one another, and the prevailing repulsion between exterior MCS surfaces results in a low remolded shear strength, a high sensitivity, and a high propensity for flow sliding. The evidence indicates that the attraction-detachment model explains the high sensitivity and contributes to an improved understanding of the mechanisms of flow sliding in sensitive, altered tephras rich in spheroidal halloysite

Abertay Research Portal

Adelaide Research & Scholarship

Research Commons@Waikato

Publishing Network for Geoscientific and Environmental Data